An approach to mining the multi-relational imbalanced database

نویسندگان

  • Chien-I Lee
  • Cheng-Jung Tsai
  • Tong-Qin Wu
  • Wei-Pang Yang
چکیده

The class imbalance problem is an important issue in classification of Data mining. For example, in the applications of fraudulent telephone calls, telecommunications management, and rare diagnoses, users would be more interested in the minority than the majority. Although there are many proposed algorithms to solve the imbalanced problem, they are unsuitable to be directly applied on a multirelational database. Nevertheless, many data nowadays such as financial transactions and medical anamneses are stored in a multi-relational database rather than a single data sheet. On the other hand, the widely used multi-relational classification approaches, such as TILDE, FOIL and CrossMine, are insensitive to handle the imbalanced databases. In this paper, we propose a multi-relational g-mean decision tree algorithm to solve the imbalanced problem in a multi-relational database. As shown in our experiments, our approach can more accurately mine a multi-relational imbalanced database. 2007 Elsevier Ltd. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning from Skewed Class Multi-relational Databases

Relational databases, with vast amounts of data–from financial transactions, marketing surveys, medical records, to health informatics observations– and complex schemas, are ubiquitous in our society. Multirelational classification algorithms have been proposed to learn from such relational repositories, where multiple interconnected tables (relations) are involved. These methods search for rel...

متن کامل

An efficient approach for effectual mining of relational patterns from multi-relational database

Data mining is an extremely challenging and hopeful research topic due to its well-built application potential and the broad accessibility of the massive quantities of data in databases. Still, the rising significance of data mining in practical real world necessitates ever more complicated solutions while data includes of a huge amount of records which may be stored in various tables of a rela...

متن کامل

Multi Relational Data Mining Approaches: A Data Mining Technique

The multi relational data mining approach has developed as an alternative way for handling the structured data such that RDBMS. This will provides the mining in multiple tables directly. In MRDM the patterns are available in multiple tables (relations) from a relational database. As the data are available over the many tables which will affect the many problems in the practice of the data minin...

متن کامل

Fuzzy multi-criteria selection procedures in choosing data source

Technology assessment and selection has a substantial impact on organizations procedures in regards to technology transfer. Technological decisions are usually made by a group of experts, and whereby integrity of these viewpoints to a single decision can be quite complex. Today, operational databases and data warehouses exist to manage and organize data with specific features and henceforth, th...

متن کامل

Multi-relational data mining in Microsoft SQL

Most real life data are relational by nature. Database mining integration is an essential goal to be achieved. Microsoft SQL Server (MSSQL) seems to provide an interesting and promising environment to develop aggregated multi-relational data mining algorithms by using nested tables and the plug-in algorithm approach. However, it is currently unclear how these nested tables can best be used by d...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Expert Syst. Appl.

دوره 34  شماره 

صفحات  -

تاریخ انتشار 2008